Blind Source Separation of Speech Mixtures using a Simple and Computationally Efficient Time-Frequency Approach
Abstract
This paper presents a very simple and computationally efficient algorithm for the blind separation of two speech sources from two mixtures. The algorithm exploits the approximate W-disjoint orthogonality of speech signals and assumes a specific sensor (microphone) arrangement that gives the sources a property we call cross high-low diversity. Two sources are said to be cross high-low diverse (CH-LD) if they are not both close to the same sensor, where a source is said to be close to a sensor if its energy at that sensor is higher than its energy at the other sensor. Under this assumption and W-disjoint orthogonality, a speech source can be extracted from either of the two mixtures with good signal-to-interference ratios (SIRs) using a simple algorithm that compares the ratios of the magnitudes of the time-frequency representations of the two mixtures. The proposed algorithm was tested on different mixtures and proved to be efficient with both instantaneous and echoic real mixtures. Finally, performance optimization and future extension to non-CH-LD sources were found to be possible.
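The magnitude comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `stft` and `chld_separate`, the frame and hop sizes, the Hann window, and the decision threshold of 1 on the magnitude ratio are all assumptions chosen for illustration. The threshold follows from the CH-LD definition (a source is closer to the sensor where its energy is higher), and W-disjoint orthogonality is what justifies assigning each time-frequency point to a single source.

```python
import numpy as np


def stft(x, frame=256, hop=128):
    """Hann-windowed short-time Fourier transform, shape (n_frames, n_bins)."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def chld_separate(x1, x2, frame=256, hop=128):
    """Split two mixtures into two source estimates in the time-frequency domain.

    Assumption (hypothetical rule, not taken from the paper): a point belongs
    to the source close to sensor 1 when |X1| / |X2| > 1, and to the source
    close to sensor 2 otherwise.
    """
    X1 = stft(x1, frame, hop)
    X2 = stft(x2, frame, hop)
    # Comparing |X1| > |X2| is the ratio test |X1| / |X2| > 1 without a division.
    closer_to_1 = np.abs(X1) > np.abs(X2)
    S1 = np.where(closer_to_1, X1, 0)  # source near sensor 1, taken from mixture 1
    S2 = np.where(closer_to_1, 0, X2)  # source near sensor 2, taken from mixture 2
    return S1, S2, closer_to_1


# Synthetic demo: two tones stand in for W-disjoint speech sources.
fs = 8000
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 440 * t)   # louder at sensor 1
s2 = np.sin(2 * np.pi * 1800 * t)  # louder at sensor 2
x1 = s1 + 0.5 * s2
x2 = 0.5 * s1 + s2
S1, S2, mask = chld_separate(x1, x2)
print("fraction of points assigned to source 1:", mask.mean())
```

The sketch stops at the masked time-frequency representations; recovering waveforms would additionally require an inverse STFT with overlap-add, which is omitted here. The two mixing gains in the demo (1 and 0.5) are arbitrary values that satisfy the CH-LD condition.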
Similar resources
From Blind Source Separation to Blind Source Cancellation in the Underdetermined Case: a New Approach Based on Time-frequency Analysis
Many source separation methods are restricted to non-Gaussian, stationary and independent sources. This yields some problems in real applications where the sources often do not match these hypotheses. Moreover, in some cases we are dealing with more sources than available observations which is critical for most classical source separation approaches. In this paper, we propose a new simple sourc...
Phase Aliasing Correction For Robust Blind Source Separation Using DUET
Degenerate Unmixing Estimation Technique (DUET) is a technique for blind source separation (BSS). Unlike the ICA-based BSS techniques, DUET is a time-frequency scheme that relies on the so-called W-disjoint orthogonality (WDO) property of the source signals, which states that the windowed Fourier transforms of different source signals have statistically disjoint supports. In addition to being co...
Blind Source Separation of Convolutive Mixtures of Speech in Frequency Domain
This paper overviews a total solution for frequency-domain blind source separation (BSS) of convolutive mixtures of audio signals, especially speech. Frequency-domain BSS performs independent component analysis (ICA) in each frequency bin, and this is more efficient than time-domain BSS. We describe a sophisticated total solution for frequency-domain BSS, including permutation, scaling, circular...
Subband-Based Blind Separation for Convolutive Mixtures of Speech
We propose utilizing subband-based blind source separation (BSS) for convolutive mixtures of speech. This is motivated by the drawback of frequency-domain BSS, i.e., when a long frame with a fixed long frame-shift is used to cover reverberation, the number of samples in each frequency decreases and the separation performance is degraded. In subband BSS, (1) by using a moderate number of subband...
8 The DUET Blind Source Separation
This chapter presents a tutorial on the DUET Blind Source Separation method which can separate any number of sources using only two mixtures. The method is valid when sources are W-disjoint orthogonal, that is, when the supports of the windowed Fourier transform of the signals in the mixture are disjoint. For anechoic mixtures of attenuated and delayed sources, the method allows one to estimate...
Publication date: 2006